Mixture of Probabilistic Linear Regressions for Voice Conversion

Authors

  • Yu QIAO
  • Daisuke SAITO
  • Nobuaki MINEMATSU
Abstract

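This paper proposes the Mixture of Probabilistic Linear Regressions (MPLR), a model that learns a mapping between two feature spaces. MPLR is built as a weighted mixture of multiple probabilistic linear regression models, and its parameters can be estimated by matrix computation. Because MPLR is a mixture model, it can handle nonlinear mappings. Because it is a generalized formulation, it does not require a specific model for the probability density. The well-known GMM-based voice conversion methods [1], [2] can be interpreted as special cases of MPLR, and this generalization makes it possible to improve them. For [1], the MPLR formulation avoids solving a complicated system of linear equations and allows faster parameter estimation. MPLR can also resolve an implicit problem present in [2]. We evaluated the proposed method against conventional GMM methods on a voice conversion task. In experiments under a variety of parameter settings, MPLR performed better than the conventional methods.

Keywords: space mapping, nonlinear mapping, mixture model, linear regression, voice conversion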
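The idea of mixing several linear regressions with soft weights, and estimating them by matrix computation, can be illustrated with a small NumPy sketch. It is not the paper's estimator: the function names, the isotropic Gaussian weighting in `responsibilities`, the `ridge` term, and the independent per-component weighted least-squares fit are all assumptions chosen for illustration. The toy data (`y = |x|`) only shows that a mixture of linear maps can follow a nonlinear mapping that a single linear map cannot.

```python
import numpy as np

def responsibilities(X, centers, scale=1.0):
    """Soft assignment of each source vector to K components.

    Assumption: isotropic Gaussian kernels with equal priors. Any mixture
    model over the source vectors could supply these weights instead.
    """
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)  # (N, K)
    logits = -0.5 * d2 / scale**2
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    W = np.exp(logits)
    return W / W.sum(axis=1, keepdims=True)

def fit_mixture_regressions(X, Y, W, ridge=1e-8):
    """Fit one affine map per component by weighted least squares.

    Assumption: components are fit independently, each solving
    (Xa^T diag(w_k) Xa) A_k = Xa^T diag(w_k) Y.
    """
    N, d = X.shape
    Xa = np.hstack([X, np.ones((N, 1))])  # append a bias column
    maps = []
    for k in range(W.shape[1]):
        Xw = Xa * W[:, k:k + 1]                      # rows scaled by w_k
        G = Xa.T @ Xw + ridge * np.eye(d + 1)        # weighted Gram matrix
        maps.append(np.linalg.solve(G, Xw.T @ Y))    # (d+1, D)
    return np.stack(maps)                            # (K, d+1, D)

def convert(X, W, maps):
    """Map source vectors as a weight-blended sum of the affine maps."""
    Xa = np.hstack([X, np.ones((X.shape[0], 1))])
    per_comp = np.einsum('nd,kdo->nko', Xa, maps)    # (N, K, D)
    return (W[:, :, None] * per_comp).sum(axis=1)    # (N, D)

# Toy example: a nonlinear (piecewise-linear) mapping y = |x|.
X = np.linspace(-2.0, 2.0, 201)[:, None]
Y = np.abs(X)
centers = np.array([[-1.0], [1.0]])  # hypothetical component centers

W = responsibilities(X, centers, scale=0.5)
maps = fit_mixture_regressions(X, Y, W)
mse_mix = float(np.mean((convert(X, W, maps) - Y) ** 2))

# Baseline: a single linear regression (one component, all weights 1).
W1 = np.ones((X.shape[0], 1))
maps1 = fit_mixture_regressions(X, Y, W1)
mse_lin = float(np.mean((convert(X, W1, maps1) - Y) ** 2))

print("mixture MSE:", mse_mix, "single linear MSE:", mse_lin)
```

Because the weights enter only as row scalings of the design matrix, the source density model can be swapped without touching the regression step, which loosely mirrors the abstract's point that no specific density model is required.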

Similar resources

Bayesian Mixture of Probabilistic Linear Regressions for Voice Conversion

The objective of voice conversion is to transform the voice of one speaker to make it sound like another. The GMM-based statistical mapping technique has been proved to be an efficient method for converting voices [1, 2]. In a recent work [3], we generalized this technique to Mixture of Probabilistic Linear Regressions (MPLR) by using general mixture model of source vectors. In this paper, we i...

Personalizing a speech synthesizer by voice adaptation

A voice adaptation system enables users to quickly create new voices for a text-to-speech system, allowing for the personalization of the synthesis output. The system adapts to the pitch and spectrum of the target speaker, using a probabilistic, locally linear conversion function based on a Gaussian Mixture Model. Numerical and perceptual evaluations reveal insights into the correlation between...

Probabilistic Voice Conversion Using Gaussian Mixture Models

This paper explores the topic of voice conversion as explored in a joint project with Percy Liang (EECS, Berkeley). For our purposes, voice conversion is the process of modifying the speech signal of one speaker (source) so that it sounds as though it had been pronounced by a different speaker (target). By using a Gaussian mixture model (GMM) to model the features of the source speaker, we can...

CELP Coder Modification for the Voice Conversion

Voice Conversion (VC) consists in modifying a source voice to a target speaker voice. In our approach, we modified only the Code-Excited Linear Prediction (CELP) coder by introducing a pre-processing step before the coder for the voice conversion. The decoder part of CELP was not modified. This allows maintaining the transmission rate. Our approach for conversion consists in separating the voiced an...

Novel Radial Basis Function Neural Networks based on Probabilistic Evolutionary and Gaussian Mixture Model for Satellites Optimum Selection

In this study, two novel learning algorithms have been applied on Radial Basis Function Neural Network (RBFNN) to approximate the functions with high non-linear order. The Probabilistic Evolutionary (PE) and Gaussian Mixture Model (GMM) techniques are proposed to significantly minimize the error functions. The main idea is concerning the various strategies to optimize the procedure of Gradient ...

Continuous probabilistic transform for voice conversion

Voice conversion, as considered in this paper, is defined as modifying the speech signal of one speaker (source speaker) so that it sounds as if it had been pronounced by a different speaker (target speaker). Our contribution includes the design of a new methodology for representing the relationship between two sets of spectral envelopes. The proposed method is based on the use of a Gaussian mi...

Publication date: 2009